METHOD OF GENERATING SYNTHETIC KEY FRAME AND VIDEO 
BROWSING SYSTEM USING THE SAME 

BACKGROUND OF THE INVENTION 

1. Field of the Invention

The present invention relates to a content-based multimedia searching
system and, more particularly, to a synthetic key frame generating method
capable of displaying a large amount of information on a screen with a fixed
size, and a video browsing system using the same.

2. Description of the Related Art 

With the development of image/video processing technologies in recent
years, users can search, filter and browse a desired part of desired video
content (a moving picture such as a movie, drama or documentary program)
at a desired time.

Basic techniques for non-linear video browsing or searching include
shot segmentation and shot clustering. These techniques are used for
analyzing, searching and browsing multimedia contents.

In image/video processing technologies, a shot is a sequence of
video frames obtained by one camera without interruption, and is the basic
unit for constructing and analyzing a video. A scene is a meaningful
constituent element of the video, that is, a significant element in the
development of the story. One scene includes a number of shots.

Meanwhile, a video indexing system structurally analyses video 
contents and detects shots and scenes using a shot segmentation engine and
a shot clustering engine. The video indexing system also extracts key frames
or key regions capable of representing a segment based on the detected shots 
and scenes, and provides a tool for summarizing the video stream or directly 
moving to a desired position in the video stream. 

FIG. 1 shows structural information of a general video stream. Referring
to FIG. 1, a video stream consists of a series of scenes, each of which is a
logical story unit regardless of video genre; each scene is composed of a
plurality of sub-scenes or shots, and each shot is composed of a sequence of
frames.

Most video indexing systems extract shots from the video stream and
detect scenes based on the extracted shots, thereby indexing structural
information of the video stream. That is, the video indexing system extracts a
key frame (a video frame extracted from the video stream in order to represent
a unit segment well) or a key region, together with index data for
summarizing/searching/browsing video contents.

FIG. 2 shows the relationship between an anchor frame and a key
region in news content according to the prior art. A news icon in the anchor
frame F-an, consisting of an image or characters summarizing a news
segment, represents the contents of the anchor shot or the corresponding news
article. When it is selected as a key region Reg-k, it becomes a component
representing the corresponding segment. That is, the key region Reg-k means a
region capable of concisely representing the contents of a particular segment,
such as text, a human face or a news icon.

FIG. 3 shows a conventional non-linear video browsing interface which
includes a video reproduction view V-VD, a key frame view V-Fk displaying
one-dimensionally the key frames representing each shot or each scene, and a
tree-shaped table of contents (TOC) view V-TOC for directly providing
structural information of a video stream to users. Here, each node (ND) of the
tree-shaped TOC corresponds to a shot or scene representing the contents
included in its lower trees, and denotes a key frame. Accordingly, the
interface allows a user to easily move to a desired part of a video, or to
select and browse a desired part in the video stream, without watching the
whole content.

However, the above-described conventional video browsing system, which
represents partial sequences by key frames or key regions to
index/summarize/browse the video, has the following problems.
1) The conventional system cannot display a relatively large amount of
information on a screen having a fixed size. The conventional key frames and
key regions used in the non-linear video browsing system and in universal
multimedia access (UMA) applications serve as means for conveying the
summarized content of a video stream to users through images. However, users
cannot grasp the whole contents of the video stream through the small number
of key frames or key regions displayed on a screen having a fixed size. One
shot includes video frames displayed for several seconds to tens of seconds,
and a scene is composed of several shots, although this depends on the genre
or characteristics of the programs included in the video. For a shot that is
long or varies severely, therefore, one key frame is not adequate to represent
the shot. Accordingly, multiple key frames should be set for one shot or scene.

Furthermore, where a relatively large number of key frames is provided,
in order to represent the whole contents of a shot and/or scene, to a TV or
portable terminal that cannot display many key frames at a time on a screen
with a fixed size, the user must operate his/her input device many times
because he/she has to browse all of the key frames. The number of key
frames may be reduced to solve this problem. In this case, however, the small
number of key frames cannot represent the content of the video stream, as
described above. Accordingly, an efficient user interface capable of displaying
a large amount of information on a screen with a fixed size is required.

2) It is difficult to represent the content of a scene, which includes
shots or sub-scenes, with a single key frame. That is, it is generally
difficult to select one key frame that concisely represents the contents of a
scene.

Accordingly, there is needed a new method of summarizing a video
stream having a hierarchical structure to allow key frames of upper structures
to satisfactorily reflect contents included in lower structures.

SUMMARY OF THE INVENTION

It is, therefore, an object of the present invention to provide a method
of generating a synthetic key frame, which is capable of representing a large
amount of information on a screen with a fixed size.

Another object of the present invention is to provide a method of 
describing a synthetic key frame logically or physically formed by combining 
key frames or key regions. 
Still another object of the present invention is to provide a method of
summarizing a video hierarchically using a synthetic key frame.

Yet another object of the present invention is to provide a video 
browsing interface using a synthetic key frame. 

A further object of the present invention is to provide a non-linear
video browsing method using a synthetic key frame.

A still further object of the present invention is to provide a data
managing method using a synthetic key frame.

To accomplish the objects of the present invention, there is provided a
method of generating a synthetic key frame, comprising the steps of: receiving a
video stream from a first source and dividing it into meaningful sections;
selecting key frame(s) or key region(s) representative of a divided section; and
combining the selected key frame(s) or key region(s), to generate one synthetic
key frame.

To accomplish the objects of the present invention, there is provided a
method of describing synthetic key frame data, comprising the steps of: dividing
a video stream into meaningful sections, and synthesizing a key frame or key
region representing the content of each section into one image to generate a
synthetic key frame; and describing a list of the key frames/key regions
included in the constituent elements of the synthetic key frame.
To accomplish the objects of the present invention, there is also
provided a method of describing synthetic key frame data, comprising the steps
of: dividing a video stream into meaningful sections, and synthesizing a key
frame or key region representing the content of each section into one image to
generate a synthetic key frame; and generating a combination of key frames or
key regions, or key frame and key region included in constituent elements of the
synthetic key frame, and physically storing the combination to describe the
synthetic key frame.

To accomplish the objects of the present invention, there is provided a
hierarchical video summarizing method using a synthetic key frame, comprising
the steps of: dividing a video stream into meaningful sections, and synthesizing
a key frame or key region representing the content of each section into one
image to generate a synthetic key frame; and assigning the synthetic key frame
to a key image locator, a hierarchical summary list for describing lower
summary structures, and structural information of the video stream.
To accomplish the objects of the present invention, there is provided a
method for providing a video browsing interface, comprising the steps of:
dividing a video stream into meaningful sections, and synthesizing a key frame
or key region representing the content of each section into one image to
generate a synthetic key frame; and providing a user interface to a
predetermined display to browse a video related to the generated synthetic
key frames.

To accomplish the objects of the present invention, there is also
provided a non-linear video browsing method, comprising the steps of: dividing
a video stream into meaningful sections, and synthesizing a key frame or key
region representing the content of each section into one image, to generate a
synthetic key frame; providing a user interface to a predetermined display to
browse a video related to the generated synthetic key frames; selecting the
synthetic key frame according to an input by a user; and reproducing a segment
represented by the selected synthetic key frame.

BRIEF DESCRIPTION OF THE DRAWINGS 

A more complete appreciation of the invention, and many of the attendant
advantages thereof, will be readily apparent as the same becomes better understood
by reference to the following detailed description when considered in conjunction with
the accompanying drawings, in which like reference symbols indicate the same or
similar components, wherein:

FIG. 1 shows structural information of a general video stream; 

FIG. 2 shows the relationship between an anchor frame and a news icon in a 
prior art; 

FIG. 3 shows a conventional non-linear video browsing interface; 

FIGS. 4A and 4B are diagrams for explaining the concept of a synthetic key 
frame according to the present invention; 

FIG. 5A shows the description structure of a segment locator according to the
present invention; 

FIG. 5B shows the description structure of an image locator according to the 
present invention; 

FIG. 6 shows the description structure of a key frame locator according to the
present invention;

FIG. 7 shows the description structure of a key region locator according to the 
present invention; 

FIG. 8 shows the description structure of synthetic key frame information 
according to the present invention; 

FIG. 9 shows the description structure of a layout with respect to the 
arrangement of constituent elements of a synthetic key frame according to the 
present invention; 

FIG. 10 shows the structure of a news video according to the present 
invention; 

FIG. 11 shows a synthetic key frame of news headlines according to the
present invention;

FIGS. 12A and 12B show synthetic key frames of detailed news sections
according to the present invention;

FIGS. 13A and 13B show synthetic key frames generated from a soccer 
game video according to the present invention; 

FIG. 14 shows structural information of a video and hierarchical synthetic key 
frames according to the present invention; 

FIG. 15 shows the description structure of a hierarchical image summary 
element for hierarchical video stream summary according to the present invention; 

FIG. 16 shows a video browsing interface using a synthetic key frame 
according to the present invention; 

FIG. 17 shows an example of application of the synthetic key frame according 
to the present invention to UMA; and 

FIG. 18 is an example of a flow diagram showing a method of communicating 
information using the synthetic key frame according to the present invention, applied 
to UMA. 

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT 

Reference will now be made in detail to the preferred embodiments of 
the present invention, examples of which are illustrated in the accompanying 
drawings. 

FIGS. 4A and 4B are diagrams for explaining the concept of a synthetic key
frame according to the present invention. Referring to FIG. 4A, the synthetic key
frame according to the invention is generated by combining key frames or key
regions Reg-k from frames Fl, Fm, Fn which are extracted at predetermined points of
time tl, tm, tn within one segment Sgti when a video stream is divided into a
predetermined number of segments Sgt1, Sgt2, ..., Sgti, Sgti+1. Referring to FIG.
4B, the synthetic key frame of the invention is generated by combining key frames or
key regions Reg-k from frames Fo, Fp, Fq, Fr extracted at predetermined points of
time to, tp, tq, tr within one segment Sgtj+1 and external frames Fext supplied from
an external source when a video stream is divided into a predetermined number of
segments Sgt1, Sgt2, ..., Sgtj, Sgtj+1.

The synthetic key frame of the invention, unlike the key frame in the
prior art, is not a frame that physically exists in the video stream,
because it is created by combining key frames or regions having meaningful
information in order to represent a specific segment in the video stream.
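
The combination step described above can be illustrated with a minimal Python sketch. It is not the implementation of the invention: images are modelled as plain lists of pixel rows, and the names `KeyRegion`, `crop` and `compose_synthetic_key_frame`, as well as the top-to-bottom stacking policy, are assumptions introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List

Image = List[List[int]]  # assumption: a grayscale image as rows of pixel values


@dataclass
class KeyRegion:
    """A rectangular key region inside a source frame (hypothetical structure)."""
    frame: Image
    x: int
    y: int
    width: int
    height: int


def crop(region: KeyRegion) -> Image:
    """Cut the key region out of its source frame."""
    rows = region.frame[region.y:region.y + region.height]
    return [row[region.x:region.x + region.width] for row in rows]


def compose_synthetic_key_frame(regions: List[KeyRegion], width: int,
                                height: int, background: int = 0) -> Image:
    """Stack the cropped key regions top to bottom on a fixed-size canvas."""
    canvas = [[background] * width for _ in range(height)]
    top = 0
    for region in regions:
        patch = crop(region)
        for dy, row in enumerate(patch):
            if top + dy >= height:
                break  # the canvas is full; remaining rows are dropped
            for dx, pixel in enumerate(row[:width]):
                canvas[top + dy][dx] = pixel
        top += len(patch)
    return canvas


# Two 4x4 frames taken at different times within one segment.
frame_l = [[1] * 4 for _ in range(4)]
frame_m = [[2] * 4 for _ in range(4)]
synthetic = compose_synthetic_key_frame(
    [KeyRegion(frame_l, 0, 0, 4, 2), KeyRegion(frame_m, 0, 2, 4, 2)],
    width=4, height=4)
```

The resulting image does not correspond to any single frame of the stream, which is the property that distinguishes a synthetic key frame from a conventional one.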

FIGS. 5A and 5B respectively show description structures of a segment
locator and an image locator according to the present invention. Referring to FIG. 5A,
the segment locator, as a means for designating a segment in a video stream,
includes a segment ID, a media URL or actual segment data for designating the
audio-visual segment, segment time information such as the segment
starting/ending time or length, description information for annotating the
segment, and a related segment list.

Here, the related segment list is used for representing relations among
segments, such as abstract/detail and cause/result, and the components of the
list include variables such as a segment locator or an identifier for referring
to a segment locator.

Referring to FIG. 5B, the image locator, as a means for designating an image,
includes an inherent ID, an image URL, or image data for designating the image. The
image locator can have a structure capable of describing information such as
an image-related segment list and annotation.
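
As a rough illustration of the two locators described with reference to FIGS. 5A and 5B, the following Python sketch models them as data classes. The field names and types are assumptions chosen for readability; the specification only lists the kinds of information each locator carries.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SegmentLocator:
    """Designates a segment of a video stream (field names are assumptions)."""
    segment_id: str
    media_url: Optional[str] = None        # either a media URL ...
    segment_data: Optional[bytes] = None   # ... or the actual segment data
    start_time: float = 0.0                # seconds from the start of the stream
    end_time: float = 0.0
    description: str = ""                  # free-text annotation
    # Identifiers of related segment locators (abstract/detail, cause/result).
    related_segments: List[str] = field(default_factory=list)

    @property
    def length(self) -> float:
        """Segment length derived from the starting and ending times."""
        return self.end_time - self.start_time


@dataclass
class ImageLocator:
    """Designates an image by URL or by embedded data (field names are assumptions)."""
    image_id: str
    image_url: Optional[str] = None
    image_data: Optional[bytes] = None
    related_segments: List[str] = field(default_factory=list)
    annotation: str = ""


# Hypothetical example: a 59-second segment related to another segment.
headline = SegmentLocator("seg-1", media_url="news.mpg",
                          start_time=0.0, end_time=59.0,
                          related_segments=["seg-7"])
icon = ImageLocator("img-1", image_url="icon.jpg", related_segments=["seg-1"])
```

Here the related segment list holds identifiers rather than nested locators, which is one of the two options the description allows.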
FIG. 6 shows the description structure of a key frame locator according to the
present invention. As shown in FIG. 6, the key frame locator includes an image
locator and, additionally, a representative segment locator for indicating which
segment is represented by the corresponding key frame, and fidelity values for
indicating how faithfully the corresponding segment is represented.

FIG. 7 shows the description structure of a key region locator according to the 
present invention, which is a logical or physical key region description structure. 

The logical key region description structure includes an ID, an image locator,
and region area information corresponding to a key region of an image designated by
the image locator. It additionally includes a representative segment locator for
indicating which segment is represented by the corresponding key region, fidelity
values for indicating how faithfully the key region represents the corresponding
segment, description information for other annotations, and a related segment list
for designating segments related to the key region. This logical key region
description structure describes the key region using metadata.

The physical key region description structure includes an inherent ID, region
data, a representative segment locator for indicating, if required, which segment
is represented by the corresponding key region, fidelity, description and a related
segment list. For the video browsing interface using the synthetic key frame
according to the present invention, the synthetic key frame must either have been
physically generated or be logically described in a content-based data region with
respect to a video stream.

FIG. 8 shows the description structure of synthetic key frame information 
according to the present invention, which has a logical description structure and a 
physical description structure. 

As shown in FIG. 8, the logical synthetic key frame description structure
includes variables such as an ID, a representative segment locator for designating a
segment represented by the synthetic key frame, a key frame list and a key region
list that are constituent elements of the synthetic key frame, fidelity for indicating how
faithfully the synthetic key frame represents the segment, and layout information for
indicating the arrangement state of constituent elements of the synthetic key frame.

The physical synthetic key frame description structure includes variables such
as an ID, an image locator for designating the actual synthetic key frame, a 
representative segment locator for designating a segment represented by the 
synthetic key frame, fidelity for indicating how faithfully the synthetic key frame 
represents the segment, a key region list related with the synthetic key frame, and 
layout information for indicating the arrangement state of constituent elements of the 
synthetic key frame. 

Here, the key frame elements constituting the key frame list include a key frame
locator for designating a corresponding key frame and fidelity for indicating how
important the meaningful information represented by the corresponding key frame is
within the synthetic key frame structure, as shown in FIG. 8. Furthermore, the key
region elements constituting the key region list include a key region locator for
designating a corresponding key region and fidelity information for indicating how
important the meaningful information represented by the corresponding key region is
within the synthetic key frame structure. The fidelity can be extracted
automatically or manually. The automatically extracted fidelity is obtained from
information such as the duration of the key region, the size of an object, audio,
etc., and the matching level of these information items.
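
The logical description structure of FIG. 8 might be modelled as in the following Python sketch. The class and field names are assumptions, locators are referred to by identifier strings for brevity, and the fidelity range of 0 to 1 is an assumption as well, since the specification does not fix a scale.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class KeyFrameElement:
    key_frame_locator: str   # identifier of a key frame locator
    fidelity: float          # importance within the synthetic key frame (assumed 0..1)


@dataclass
class KeyRegionElement:
    key_region_locator: str  # identifier of a key region locator
    fidelity: float


@dataclass
class LogicalSyntheticKeyFrame:
    """Logical description: the image itself is not stored, only its parts."""
    id: str
    representative_segment: str            # identifier of a segment locator
    key_frames: List[KeyFrameElement] = field(default_factory=list)
    key_regions: List[KeyRegionElement] = field(default_factory=list)
    fidelity: float = 0.0                  # how faithfully the segment is represented
    layout: str = ""                       # layout description, e.g. markup text

    def constituents_by_fidelity(self) -> List[str]:
        """Return the locator IDs of all constituents, most important first."""
        pairs = [(e.fidelity, e.key_frame_locator) for e in self.key_frames]
        pairs += [(e.fidelity, e.key_region_locator) for e in self.key_regions]
        ordered = sorted(pairs, key=lambda pair: pair[0], reverse=True)
        return [locator for _, locator in ordered]


sk = LogicalSyntheticKeyFrame(
    "sk-1", "seg-1",
    key_frames=[KeyFrameElement("kf-1", 0.4)],
    key_regions=[KeyRegionElement("kr-1", 0.9), KeyRegionElement("kr-2", 0.6)])
```

Ordering constituents by fidelity is only one possible use of the per-element values, for example when a display can show just a few of them.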

FIG. 9 shows the description structure of layout information with respect to
the arrangement of constituent elements of the synthetic key frame according to the
present invention. This description structure is represented by a markup language
such as HTML or XML. Because the constituent elements of the synthetic key
frame may be arranged so as to overlap one another, the layout description structure
includes layer information about the first layer (layer=0), the second layer
(layer=1) and so on, and information about the location where the key frame or key
region contained in each layer is displayed or is to be displayed on a screen.
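
A layered layout description of this kind could be written as XML, as in the following Python sketch using the standard `xml.etree.ElementTree` module. The element and attribute names (`Layout`, `Layer`, `Element`, `ref`, `x`, `y`, `width`, `height`) are hypothetical; the specification states only that a markup language is used and that layer and location information are included.

```python
import xml.etree.ElementTree as ET


def build_layout(elements):
    """Build a layout tree from (locator_id, layer, x, y, width, height) tuples.

    Elements are grouped under one <Layer> node per layer number, lowest
    layer first, so that higher layers are understood to be drawn on top.
    """
    root = ET.Element("Layout")
    layers = {}
    for locator_id, layer, x, y, width, height in sorted(elements, key=lambda e: e[1]):
        if layer not in layers:
            layers[layer] = ET.SubElement(root, "Layer", {"layer": str(layer)})
        ET.SubElement(layers[layer], "Element", {
            "ref": locator_id,
            "x": str(x), "y": str(y),
            "width": str(width), "height": str(height),
        })
    return root


# A full key frame on layer 0 with a text key region overlapping it on layer 1.
layout = build_layout([
    ("kr-2", 1, 10, 10, 80, 20),
    ("kf-1", 0, 0, 0, 160, 120),
])
xml_text = ET.tostring(layout, encoding="unicode")
```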

An example of applying the synthetic key frame structure and the synthetic key
frame generating method according to the invention to a broadcasting program will
now be explained.

A) Synthetic key frame generated from a news video

FIG. 10 shows the structure of the news video according to the present
invention. The news video is generally composed of a headline news section NS-HL,
a detailed news section NS-DT, a summary news section and a weather/sports
section. A commercial advertisement section may be added thereto. Each of these
sections further includes sub-sections. A section corresponds to a scene in the
video stream structure. For example, the headline news section NS-HL may be
divided into headline items HL-it and the detailed news section NS-DT may be
classified into news items DT-it. Here, the items can be formed of key frames. Each
news item DT-it is basically divided into an anchor scene Scn-an and an episode
scene Scn-ep.

FIG. 11 shows an example of a process of generating the synthetic key frame 
of headline news section NS-HL according to the present invention. 

The headline news section NS-HL is composed of five headline items HL-it.
These headline items comprise twenty-three shots, and the running time is
approximately 59 seconds. The five headline items are summarized using key frames
F1, F2, F3, F4 and F5 extracted at points of time t1, t2, t3, t4 and t5, respectively.
Accordingly, one synthetic key frame Fsk according to the present invention is
created by extracting key regions Reg1, Reg2, Reg3, Reg4 and Reg5, consisting of
text, from the key frames F1, F2, F3, F4 and F5 and combining them.
The synthetic key frame can display the whole contents of the headline news section
NS-HL at one time on a screen with a fixed size.

By contrast, the conventional video indexing system must select several
key frames to represent the headline news section, for example, because it assigns
at least one key frame to each individual shot or scene. Furthermore, it cannot
display the entire contents of the headline section on a screen at one time.

FIGS. 12A and 12B show synthetic key frames of detailed news sections
according to the present invention. FIG. 12A illustrates a synthetic key frame Fsk
formed from one news item NS-it that consists of twenty-one shots and is
fifty-seven seconds long, and FIG. 12B illustrates a synthetic key frame Fsk
extracted from one news item NS-it that consists of twenty-one shots and is
one hundred and seven seconds long. That is, the synthetic key frames
corresponding to the news items of a news program can be formed differently. Where
the synthetic key frames are arranged at or allocated to the corresponding nodes in
the TOC interface, the contents of the lower structures of the TOC interface can be
displayed at one time. By contrast, the conventional video indexing system must
extract many key frames for a single news item and cannot display these key frames
on a screen at the same time.

B) Synthetic key frame generated from a sports video 
Apart from news, sports video also requires segment-based summarization of
the stream. For example, a soccer video consists of a great number of video frames,
so its running time is long. To summarize the soccer video, accordingly, one shot
must be represented by many key frames, and it is difficult for one key frame to
represent a scene composed of several shots.
FIGS. 13A and 13B show synthetic key frames generated from the soccer
game video according to the present invention.

FIG. 13A illustrates a synthetic key frame Fsk generated from one scene
constructed of nine shots whose running time is sixty-five seconds, and FIG. 13B
illustrates a synthetic key frame Fsk generated from one scene constructed of nine
shots whose running time is fifty-three seconds.

Though the shots included in one scene have different contents, the synthetic
key frame Fsk according to the present invention can present an image combining
key frames or key regions that represent the entire contents of the scene, without
selecting a single key frame to represent the scene. Therefore, the synthetic key
frame Fsk can summarize the entire contents of the scene.

The synthetic key frame of the present invention can be generated using the
key frame or key region for entertainment, documentary, talk show, education,
advertisement and home shopping programs as well as the news and sports video
described above with reference to FIGS. 11, 12A, 12B, 13A and 13B.

Meanwhile, if arrangement information of the constituent elements of the
synthetic key frame, such as key regions or key frames, is included in the
description, a user is able not only to browse the corresponding video using the
synthetic key frame but also to perform non-linear video browsing using the
constituent elements. Since the synthetic key frame shown in FIG. 11, for example,
is generated by combining the key regions Reg1, Reg2, Reg3, Reg4 and Reg5 of the
key frames extracted from the headline news section, the user selects a key region
(Reg1, for instance) of the synthetic key frame so that he/she can browse the
headline news item or detailed news item corresponding to the selected key region.

FIG. 14 shows structural information of a video stream and a synthetic key
frame that hierarchically summarizes the structural information in accordance with the
present invention. In FIG. 14, nodes correspond to frames representative of a
program, shot and scene. Nodes Na, Nb, Nc and Nd are synthetic key frames
that represent the contents of the lower levels. To summarize lower structures, key
regions or key frames of the lower level can be used for the synthetic key frames of
upper structures. Accordingly, the user can search/browse a video stream at a
desired level using the hierarchical structure of the video and the synthetic key
frames. If a single key frame or key region is selected for nodes Na, Nb, Nc and
Nd, a user cannot fully understand the lower structure and content without
browsing the lower level. With a synthetic key frame, however, the user can easily
understand the structure and content of the lower level without explicitly
browsing the lower level.

Hierarchical image summary elements must be defined in order to summarize
the video stream with the hierarchical structure. FIG. 15 shows the description
structure of the hierarchical image summary element for hierarchical video stream
summary according to the present invention. The description structure of the
hierarchical image summary element, which is a recursive structure, includes
variables such as a key image locator, a list of sub-hierarchical image summary
elements, summary level information and fidelity indicating how faithfully the
corresponding synthetic key frame represents the lower structures. Here, the key
image locator is a data structure capable of designating a key frame, key region and
synthetic key frame, and the list of sub-hierarchical image summary elements
describes a lower summary structure, each element of the list being a hierarchical
image summary element. For example, when the number of elements in the list
of sub-hierarchical image summary elements is '0', the element corresponds to the
lowest node (leaf node), which means that no lower summary element exists.
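
The recursive structure of the hierarchical image summary element can be sketched in Python as follows. The class name, field names and the helper `key_images_at_level` are assumptions for illustration; key image locators are again represented by identifier strings.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HierarchicalImageSummary:
    """Recursive summary element (field names are assumptions)."""
    key_image_locator: str   # designates a key frame, key region or synthetic key frame
    level: int               # summary level, 0 for the top of the hierarchy
    fidelity: float = 1.0    # how faithfully the lower structures are represented
    children: List["HierarchicalImageSummary"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        """An empty child list marks the lowest node of the summary."""
        return len(self.children) == 0


def key_images_at_level(node: HierarchicalImageSummary, level: int) -> List[str]:
    """Collect the key image locators found at the requested summary level."""
    if node.level == level:
        return [node.key_image_locator]
    images: List[str] = []
    for child in node.children:
        images.extend(key_images_at_level(child, level))
    return images


# A program summarized by a synthetic key frame, with two scenes below it.
program = HierarchicalImageSummary("sk-program", 0, children=[
    HierarchicalImageSummary("sk-scene-1", 1, children=[
        HierarchicalImageSummary("kf-shot-1", 2),
        HierarchicalImageSummary("kf-shot-2", 2),
    ]),
    HierarchicalImageSummary("sk-scene-2", 1),
])
```

Browsing "at a desired level" then amounts to collecting the key images of that level, while a synthetic key frame at an upper node already conveys the content of the nodes beneath it.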

FIG. 16 shows an example of a non-linear video browsing interface using the
synthetic key frame according to the present invention. The video browsing interface
includes a video display view V-VD, a key frame/key region view V-Fk/Reg, and a
synthetic key frame view V-Fsk. The video display view V-VD and the key frame/key
region view V-Fk/Reg have the same functions as those of the general non-linear
video browsing interface shown in FIG. 3. The synthetic key frame view V-Fsk
displays a video summary on a screen using the synthetic key frame such that the
user can select the synthetic key frame, or a key frame or key region included in the
synthetic key frame, to easily move to the section corresponding to the key frame or
key region. The synthetic key frame view V-Fsk may be displayed one-dimensionally,
as shown in FIG. 16, or displayed in a TOC-shaped tree structure.
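
Selecting a constituent of the synthetic key frame and moving to the corresponding section can be sketched as a hit test against the layout information. The following Python sketch is illustrative only: `PlacedElement`, `select_element` and `seek_time` are hypothetical names, and the rule that the topmost layer wins when elements overlap is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PlacedElement:
    """A key frame or key region as placed in the synthetic key frame."""
    locator_id: str
    layer: int
    x: int
    y: int
    width: int
    height: int
    segment_start: float  # start time, in seconds, of the segment it represents

    def contains(self, px: int, py: int) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def select_element(elements: List[PlacedElement],
                   px: int, py: int) -> Optional[PlacedElement]:
    """Return the topmost element under the selected point, if any."""
    hits = [e for e in elements if e.contains(px, py)]
    return max(hits, key=lambda e: e.layer) if hits else None


def seek_time(elements: List[PlacedElement], px: int, py: int,
              default_start: float) -> float:
    """Playback position for a selection; falls back to the whole segment."""
    hit = select_element(elements, px, py)
    return hit.segment_start if hit else default_start


placed = [
    PlacedElement("kf-1", 0, 0, 0, 160, 120, segment_start=0.0),
    PlacedElement("kr-1", 1, 10, 10, 80, 20, segment_start=12.5),
]
```

A selection outside every constituent falls back to the start of the segment represented by the synthetic key frame as a whole.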



Meanwhile, the synthetic key frame according to the present invention can be
applied to UMA applications. Here, the UMA is an apparatus having improved
information transmission performance, which can process any multimedia
information into the form most suitable for the user environment, adapting to a
variety of changes in that environment, so that the user can conveniently use the
information. Specifically, the user can obtain only limited information depending
on his/her terminal or the network environment connecting the terminal to a server.
For instance, the user's device may support still images but not motion pictures,
or audio but not video. In addition, depending on the network connection
method/medium, the amount of data that can be transmitted to the user's device
within a predetermined period of time is limited by the insufficient transmission
capacity of the network. For a user who cannot receive and display the video stream
because of such device/network restrictions, the UMA converts the video stream
into a reduced number of key frames of decreased size that fit the user environment,
and transmits them. By doing so, the UMA can help the user to understand the
contents included in the video stream.

When applied to the UMA, the synthetic key frame of the invention can be
used as a means for providing a large amount of meaningful information while
reducing the number of key frames to be transmitted, thereby decreasing the amount
of data to be delivered.

FIG. 17 shows an example of application of the synthetic key frame according
to the present invention to the UMA. This application includes a server S generating
the synthetic key frame according to the present invention, and a terminal T for
receiving the synthetic key frame from the server S and transmitting a predetermined
request signal to the server. As described above, the synthetic key frame Fsk
consists of texts, key regions and key frames.

FIG. 18 is a flow diagram showing a method of receiving information using the
synthetic key frame according to the present invention, which is applied to UMA.
Referring to FIG. 18, when the synthetic key frame Fsk is sent from the server S to
the user's terminal T, the user selects the synthetic key frame, or a component
thereof, corresponding to the part he/she wants to browse, and then requests the
server to deliver the audio of the corresponding part (ST1). When the server S sends
the audio, the user receives it and, if it is not the information he/she wants,
stops browsing the contents included in the synthetic key frame. However,
if he/she wants more information, he/she requests more key frames for the
corresponding section (ST2). By doing so, the user can browse the contents of the
synthetic key frame in more detail, and he/she can also request the video in order
to browse the video stream (ST3).
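
The step-by-step request flow of FIG. 18 can be sketched on the terminal side as a small state machine. This is a simplified illustration under stated assumptions: the names `Stage` and `BrowsingSession` are hypothetical, requests are modelled as plain tuples rather than real network messages, and the user is assumed to stop simply by making no further request.

```python
from enum import Enum


class Stage(Enum):
    SYNTHETIC_KEY_FRAME = 0  # the synthetic key frame has been received
    AUDIO = 1                # ST1: audio of the selected part
    KEY_FRAMES = 2           # ST2: more key frames for the section
    VIDEO = 3                # ST3: the video stream itself


class BrowsingSession:
    """Tracks how much detail the terminal has requested so far."""

    def __init__(self) -> None:
        self.stage = Stage.SYNTHETIC_KEY_FRAME
        self.requests = []   # (stage name, selected component) pairs sent so far

    def request_more(self, component_id: str):
        """Ask the server for the next, more detailed form of the content.

        Returns the request that would be sent, or None once the video
        itself has already been requested.
        """
        if self.stage is Stage.VIDEO:
            return None
        self.stage = Stage(self.stage.value + 1)
        request = (self.stage.name, component_id)
        self.requests.append(request)
        return request


session = BrowsingSession()
first = session.request_more("Reg1")    # ST1: audio for the selected region
second = session.request_more("Reg1")   # ST2: more key frames
```

Because each step is requested only when the user wants more detail, data heavier than the synthetic key frame is transmitted only for the parts the user actually chooses.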

Where the synthetic key frame is applied to the UMA, the user can
select a desired part and easily browse it, so that he/she can save communication
cost. Furthermore, the server can easily transmit information about the contents of
a multimedia stream even to a device with limited functions.

As described above, the synthetic key frame of the present invention is
generated by combining key frames or key regions to represent a specific section or
segment of a video stream, thereby displaying a large amount of information on a
limited device. Moreover, the synthetic key frame can summarize a video stream
one-dimensionally or hierarchically, and it can be used as a means for non-linear
video browsing. In addition, the synthetic key frame of the invention can be
effectively applied to UMA where the performance of a terminal or transmitting
device is limited, and it can also be applied to all video genres. The video
summarizing method using the synthetic key frame of the invention can efficiently
summarize the content of a video because it can sufficiently display the content of
shots or scenes on a screen with a fixed size using the synthetic key frame.
Although specific embodiments including the preferred embodiment
have been illustrated and described, it will be obvious to those skilled in the art
that various modifications may be made without departing from the spirit and
scope of the present invention, which is intended to be limited solely by the
appended claims.
